Thomas Piketty
Economics  Dept.  •  MIT 
Cambridge,  MA  02139,  USA 


Abstract  :  Assume  agents  differ  in  their  beliefs  concerning  the  relative  importance  of 
individual  effort  and  social  rigidities  in  shaping  individual  achievements,  so  that  they  trade 
off  differently  the  social  benefits  of  equalizing  opportunities  with  the  incentive  costs  of 
taxation  (just  as  economists).  Through  their  individual  mobility  experience  they  are  exposed 
to  different  signals  regarding  these  structural  parameters,  and  in  the  long-run  "left-wing 
dynasties"  believing  less  in  individual  effort  and  voting  for  more  redistribution  coexist  with 
"right-wing  dynasties".  Thus  we  are  able  to  explain  (1)  why  rich  and  poor  claim  the  same 
abstract  principles  of  distributive  justice  but  vote  differently,  (2)  why  social  origins  and  not 
only  current  income  play  a  crucial  but  (mostly)  indirect  role  in  shaping  one's  political 
attitudes,  (3)  how  persistent  differences  in  popular  beliefs  about  social  mobility  and  the 
need for redistribution can be sustained although the underlying, "true" mobility rates are
essentially  the  same  (US  vs  Europe),  (4)  why  individual  countries  tend  to  be  politically 
homogeneous. 


*I  am  grateful  to  seminar  participants  at  MIT,  Harvard  (Economics  Dept.  and  Kennedy 
School),  Columbia  and  Boston  University  for  their  comments. 


Section  1  :  Introduction. 

This paper develops a rational-choice theory^1 of redistributive politics seeking to explain
important  stylised  facts  concerning  the  effect  of  social  mobility  both  on  individual  political 
attitudes  and  aggregate  political  outcomes. 

The  idea  that  social  mobility  plays  a  crucial  role  in  shaping  political  attitudes  (in 
particular  towards  redistribution)  has  a  long  history  in  the  social  sciences. 
Tocqueville(1835)  first  stressed  the  idea  that  the  difference  in  attitudes  toward 
redistribution  between  Europe  and  the  United  States  could  be  explained  by  presumed 
differences  in  mobility  rates.  Since  then,  many  authors  have  followed  this  line  to  explain 
the  absence  of  any  strong  socialist  movement  in  the  US,  among  which  Marx(1852), 
Sombart(1906)  and  Petersen (1953).  On  the  other  hand  comparative  empirical  studies  of 
social  mobility  rates  have  long  demonstrated  the  absence  of  any  significant  difference 
between  industrial  nations  (see,  e.g.,  Lipset-Bendix(1959),  Erikson-Goldthorpe(1985, 1992)). 
Lipset-Bendix(1959)  and  Lipset(1966,1977,1992)  have  repeatedly  suggested  that  persistent 
differences between European and US redistributive politics may be due to persistent
differences in popular beliefs about social mobility^2.


^1 That is, as we understand it, a theory describing precisely the values and preferences
individuals are promoting, the information sets they are exposed to and the institutions
aggregating their actions. This differs from most sociological "explanations" of the effect of
one's mobility experience on one's political attitudes.

^2 "What explains the contrast in the political values and allegiances of American
workers with those of other democratic nations? (...) the belief system concerning class
rigidities stemming from varying historical experiences (...) seems much more important
than slight variations in rates of mobility". [Lipset(1992, pp.xx-xxi)]. Regarding the
presumed lesser importance of government as a redistributor of income in the US as
But social mobility is known to have crucial effects at the individual level as well.
Although current income is correlated with voting attitudes toward redistribution
(higher-income groups vote less for left-wing redistributive policies), the correlation is much
less than one, and upwardly- or downwardly-mobile voters always exhibit an intermediate
position between stable low-income and high-income voters^3; that is, the following table
summarizes the typical voting patterns observed across time and industrial democracies
(see, e.g., Abramson(1973), Thomson(1971), Boy(1980), Cherkaoui(1992)):


                                  respondent's income^4
                              low-income     high-income

parents'     low-income    |     70%      |      40%
income       high-income   |     45%      |      25%


Table 1. Percentage of votes for left-wing
parties as a function of individual mobility experience.


That is, seven out of ten lower-class voters born in the lower class typically vote for left-wing
parties, against fewer than one half of lower-class voters born in the middle class. From this
matrix it would appear that parents' income class determines one's political attitudes as


compared  to  Europe,  see,  e.g.,  the  table  reported  in  Mueller(1989,  p.336). 

^3 A few studies found that upwardly-mobile agents are on average more right-wing than
the stable middle-class (mostly in the US); however later studies have shown that this was
non-robust (see Thomson(1971)) and this thesis has apparently been abandoned.

^4 This sociology/political science literature usually cuts the society into two halves: lower-class,
manual occupations, and middle-class, non-manual occupations. Although this is
highly rudimentary, more sophisticated studies confirm the basic findings (see
Turner(1992)).
much as one's current income, whereas straight economic rationality should imply that only
current income and not past family income^5 should determine one's interests in
redistribution^6, as in the standard economic models of redistributive politics^7.

Our  primary  objective  is  to  provide  a  common  framework  to  account  for  these  various 
stylised  facts.  The  basic  idea  of  our  theory  is  that  although  voters  may  share  common 
distributive  goals,  through  their  various  mobility  experiences  they  (rationally)  happen  to 
learn  and  to  believe  different  things  concerning  the  openness  of  their  society  and  how 
serious  the  incentive  problem  is.  That  is,  we  model  rational  agents  as  trying  to  learn  from 
their income trajectory not only the mobility matrix of their society but mostly how
responsive  individual  promotions  and  achievements  are  to  individual  effort  (as  opposed  to 
predetermined  factors),  so  as  to  evaluate  the  incentive  costs  of  redistributive  taxation.  Such 
a  learning  process  is  essentially  of  the  same  nature  as  Rothschild(1974)'s  multi-armed 


^5 To the extent that the process of intergenerational income mobility exhibits no memory,
which seems reasonable.

^6 One could argue that not only redistribution is involved when voting for some political
party. However the picture survives when disaggregated studies try to isolate attitudes
toward inequality and redistribution. See the studies edited by Miller(1992).

^7 See, e.g., Mueller(1989) for the standard economic models of redistributive politics, and
Perotti(1992). Aside from the stylised facts mentioned above (which by nature these
theories cannot accommodate), the usual median-voter model of redistribution does not seem
to be particularly consistent with the data (see, e.g., Alesina and Perotti(1993) and
Perotti(1992)). See Piketty(1993) for an alternative viewpoint on the political economy of
redistribution with perfectly-informed, selfish voters.
bandit problem^8, and in the same way, costly experimentation implies that different
dynasties converge toward different beliefs regarding society's mobility parameters and
therefore different beliefs concerning the socially-optimal redistribution rate.

The  key  point  is  that  in  the  long-run,  the  same  reasons  lead  some  dynasties  to  support 
higher  taxation  and  redistribution  and  at  the  same  time  to  supply  less  effort,  while  some 
other  dynasties  support  lower  redistribution  and  at  the  same  time  work  harder  to  be 
successful; namely, in the long-run some dynasties believe (maybe rightly) that
predetermined factors are more important than individual effort in shaping individual
achievements, while some others believe (maybe rightly) that individual effort is the key to
success and social rigidities are second-order^9. This implies that in steady-state there are
more "left-wing dynasties" in the lower-class and more "right-wing dynasties" in the
middle-class (regardless of which dynasties have the "right" beliefs, if any), although everybody
started with the same distributive goal. Moreover, upwardly- and downwardly-mobile groups
include  intermediate  fractions  of  left-wing  and  right-wing  dynasties  as  compared  to  stable 
lower-class  and  upper-class  agents,  which  leads  exactly  to  the  voting  patterns  depicted  in 
table  1. 

The  multiplicity  of  steady-states  explains  at  the  same  time  why  different  countries  can 
remain  in  different  redistributive  equilibria  although  the  underlying  structural  parameters 


^8 The learning process under consideration is actually more sophisticated than a
standard multi-armed bandit problem, both because individual learning depends on some
aggregate variable (redistributive taxation) and because of the possibility of learning from
others' experiences; however this does not change the basic result of long-run heterogeneity
(see below, and especially section 4).

^9 In fact, there is a whole continuum in between these two extreme dynasties.
of mobility are essentially the same. This is particularly likely if a country exhibited for
some time in the past a significantly different experience of social mobility before joining
the  "common"  pattern  (the  19th  century  US  had  a  significantly  different  social  structure 
before  converging  to  common  features  along  with  Europe). 

Four  different  pieces  of  evidence  lead  us  to  think  this  theory  has  some  relevance.  First, 
when  asked  what  they  think  about  inequality  and  redistribution  and  why  they  vote  the  way 
they  do,  it  appears  that  people  from  different  social  backgrounds  share  a  wide  consensus 
about  abstract  principles  of  distributive  justice  (ability  per  se  is  usually  considered  as  an 
irrelevant  basis  for  desert  unless  it  is  seen  as  being  a  result  of  previous  efforts;  people  can 
deserve  unequal  rewards  only  on  the  basis  of  features  (such  as  effort)  that  are  subject  to 
voluntary  control),  but  that  they  differ  substantially  on  practical  assessments  concerning 
the  key  to  personal  success  (the  poor  emphasizing  structural  factors,  the  rich,  personal 
qualities  such  as  effort  and  ambition)  (see  Rytina,  Form  and  Pease(1970),  Kluegel  and 
Smith(1986,  chapsJ-4),  and  Miller(1992)).  In  some  sense,  this  paper  chooses  to  take 
seriously  people's  justification  of  their  attitudes  toward  redistribution,  instead  of  describing 
them as egoistic liars from the outset^10.

Next, voting patterns indeed exhibit a remarkably high rate of dynastic reproduction:
Abramson(1972) shows Italian data where more than 80% of voters with left-wing parents


^10 One could obviously argue that people are basically egoistic and ex-post "find" some
beliefs to justify their behavior. But then one has to explain why income is not perfectly
correlated with one's vote (see table 1). Methodologically, it makes sense to assume agents
lie in survey studies only if this is necessary to account for the actions and facts under
consideration, which is not the case here.
voted for left-wing parties, irrespective of their social class and their mobility experience.
This gives strong support to our theory^11, which says that in the long-run individual
mobility experience has a substantial but completely indirect effect on individual political
attitudes: that is, conditioning individual political attitudes on parents' political attitudes
cancels almost completely the effect of individual social mobility on voting behaviour
depicted in table 1^12.

Also,  note  that  the  idea  that  a  common  cause  leads  some  agents  to  support  redistribution 
and  to  supply  less  effort  is  similar  to  the  old  view  that  highly  politicized  workers  do  not  try 
to use chances of social ascent as much as workers with less class consciousness^13 (see
Kaelble(1985,  p.6)). 

Finally,  the  view  that  there  exists  wide  and  persistent  disagreements  about  the  incentive 
costs  of  taxation  is  supported  by  the  strong  lack  of  consensus  among  economists  when  they 
attempt  to  quantify  these  costs:  everybody  agrees  that  a  90%  marginal  income  tax  rate  may 
well discourage labour supply and that a 10% rate leaves room for more taxation, but the
consensus is not preserved long if we try to go further. This is hardly surprising since
economists face the same basic limitations as the agents described in this paper: the only


^11 It is hard to reconcile these very high rates of dynastic reproduction with the basic
voting patterns of table 1 without a theory giving a common reason why some dynasties vote
for more redistribution and at the same time have lower rates of upward mobility.

^12 See also Kelley(1992) for some detailed evidence showing that the effect of social
origins is mostly indirect, i.e. goes through the parents' political preference and not the
class per se.

^13 This example illustrates that "left-wing dynasties" may very well spend high effort
levels on other objectives which are not related to social ascent (such as trade-union
activism or teaching).
way to know for sure the optimal redistribution rate would be to try it for a while, and this
entails substantial social costs. The difference (hopefully) is that most agents base their
assessment on their limited personal experience (so that their eventual beliefs are to a large
extent forecastable), whereas scholars perform more sophisticated cognitive processes than
those implied by Bayes' rule, and/or have more time to find more information^14.

The  rest  of  this  paper  is  organized  as  follows:  section  2  sets  up  a  simple  model  of 
redistribution  and  learning;  section  3  analyzes  long-run,  steady-state  voting  patterns  and 
redistribution  rates;  section  4  shows  how  sophisticating  the  collective  learning  process  does 
not  affect  the  long-run  heterogeneity  of  beliefs  but  restricts  in  interesting  ways  the  degree 
of  heterogeneity  that  one  ought  to  observe  in  any  single  country;  section  5  attempts  to  make 
some  outside  observer's  welfare  comparisons  of  the  various  steady-states;  section  6  gives 
concluding  comments. 

Section  2  :  A  Model  of  Redistribution  and  Learning. 

In  order  to  highlight  the  heterogeneity  of  voting  behavior  stemming  from  heterogeneous 
beliefs,  we  consider  a  model  of  redistribution  where  different  income  groups  do  not  a  priori 
have different distributive objectives when they vote over redistributive policies^15. This may


^14 Section 5 shows how an outside observer can use our theory and international evidence
to make some (limited) progress in assessing these incentive costs.

^15 As we repeatedly stress throughout the paper, a model where voting heterogeneity comes
entirely from heterogeneous, well-informed economic interests cannot explain the voting
arise either because redistribution is of a pure social-insurance nature (each agent faces
equal chances at the beginning of each period), or because all agents share the same
distributive-justice principles, although they may have different material interests in
redistribution. For reasons discussed above, we choose to focus on the latter case, at no cost
in generality (see below).

We assume a discrete infinite horizon, $t = 1, 2, \ldots$, and we consider an economy made up of
a continuum of agents $I = [0,1]$. For convenience we shall think of each period as a
generation, and of each agent as having exactly one offspring each period^16.

During each period each agent can obtain one of two possible pre-tax incomes $y_0$, $y_1$, with
$y_1 > y_0 > 0$. We denote by $L_t$ (resp. $H_t = 1 - L_t$) the mass of agents born at time $t$ in
low-income families (resp. in high-income families)^17.

Agents obtain income $y_0$ or $y_1$ depending on luck, how much effort they spent, and social
origins (i.e. parents' income). More precisely, the probability that an agent with social
origins $y_0$ (resp. $y_1$) and with effort supply $e$ obtains income $y_1$ is given by

$\mathrm{proba}(y_1 \mid e, y_0) = \pi_0 + \theta e$
(resp. $\mathrm{proba}(y_1 \mid e, y_1) = \pi_1 + \theta e$)


patterns  of  table  1.  This  does  not  preclude  real-world  individual  concerns  for  redistribution 
to  be  some  complex  combination  of  selfish  and  "social"  values  (as  long  as  this  is  consistent 
with  table  1  and  the  observed  rates  of  political  reproduction). 

^16 Although nothing would be changed if lifetimes lasted several periods.

^17 $L_t$ is the mass of agents obtaining income $y_0$ at time $t-1$.


We assume that $0 < \pi_0 < \pi_1$ to reflect that children from high-income families have access
to better opportunities (on average). $\theta > 0$ measures the extent to which individual
achievement is responsive to individual effort.

Agents' material welfare is given by

$U = E(y) - C(e)$ ^18
with $C(e) = e^2/2a$, $a > 0$ ^19 and $E(\cdot)$ the expectation.

After they choose their effort level $e$ and their income shock $y_0$ or $y_1$ is realised, agents vote
over the redistributive policy $\tau_{t+1}$ to be applied next period^20. Thus at the time of the vote
there are four types of agents: the stable lower-class, noted $SL_t$ (those whose parents'
income was $y_0$ and whose income is also $y_0$), the downwardly-mobile, noted $DM_t$ (those whose


^18 That is, agents maximize their own one-shot utility. We feel comfortable with this
zero-discount-factor (infinite time preference) assumption first because the result of long-run
heterogeneity of beliefs would hold as long as the discount factor is not too close to 1 (see
Rothschild(1974) and Aghion et al.(1991)), and most of all because we view the process of
learning about society's mobility parameters (described below) more as an unintended
side-product of social experience than as a self-conscious search with optimal experimentation
for the sake of future generations (maybe because people do not feel there is anything
stationary to learn about mobility). Therefore assuming a zero discount factor for this
learning process is not contradictory with substantial intergenerational altruism.

^19 We assume $a$ to be small enough so that probabilities will always be between 0 and
1. We choose this simple functional form for $C(e)$ for the sake of notational simplicity.

^20 As to why people go and vote despite their negligible importance, we have nothing
original to say. Assume for example that the continuum economy we described so far is in
fact a large finite economy with some positive probability of being the decisive voter (the
economy must be sufficiently large so that agents' "social" concerns do not show up when
choosing effort levels).
parents' income was $y_1$ and who have gone down to $y_0$), the upwardly-mobile, noted $UM_t$
(those whose parents' income was $y_0$ and who have moved up to $y_1$), and the stable
high-income (or middle-class), noted $SH_t$ (those whose parents' income was $y_1$ and whose income
is also $y_1$).

We assume that when voting over redistribution these different agents share the same
social welfare function. To fix ideas, we assume that they all think that unequal
opportunities (i.e. $\pi_0 < \pi_1$) are a bad thing, and that the state should try to correct this as
much as possible, i.e. should try to maximize the expected welfare of lower-class children
by redistributing income from $y_1$ to $y_0$ ^21. Those readers who feel unhappy with this social
objective  can  replace  it  by  another  social  welfare  function  (such  as  the  utilitarian  sum  of 
utilities,  assuming  risk  aversion),  without  changing  the  substance  of  what  follows  (see 
below). 

The  important  point  is  that  every  voter  is  going  to  balance  the  social  benefits  of  equalizing 
opportunities with the incentive costs of taxation. That is, setting a tax rate $\tau_{t+1}$ will lead
period-$(t+1)$ agents to choose an effort level $e(\tau_{t+1}, \theta)$ maximizing their own expected
welfare^22:


^21 Here we assume that the only redistributive policy tool available is pure income
redistribution (i.e. tax income at rate $\tau$ and redistribute everything in a lump-sum way).
Nothing would be modified if we assumed that the state could use public money to act
directly on the high-achievement probability $\pi_0$ of lower-class children (for example through
public schooling).

^22 Obviously, there is nothing contradictory between maximizing a "social" objective
function when voting and maximizing private welfare when choosing one's effort level: in
the latter case, no positive-mass effect is imposed on the aggregate. This is the traditional
distinction between private and social values (see, e.g., Arrow(1963, p.18)).
$e(\tau_{t+1}, \theta) = \mathrm{ArgMax}_{e \geq 0}\ \theta e (1 - \tau_{t+1})(y_1 - y_0) - C(e)$
that is: $e(\tau_{t+1}, \theta) = a \theta (1 - \tau_{t+1})(y_1 - y_0)$
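The step between the two lines above is a single first-order condition. Written out with the quadratic cost $C(e) = e^2/2a$ from the definition of material welfare (the lump-sum transfer does not depend on one's own effort and so drops out), it reads as follows; this is a reconstruction of the intermediate step for the reader's convenience, not text from the original:

```latex
\begin{align*}
\frac{\partial}{\partial e}\Big[\theta e\,(1-\tau_{t+1})(y_1-y_0) - \frac{e^2}{2a}\Big]
  &= \theta\,(1-\tau_{t+1})(y_1-y_0) - \frac{e}{a} = 0 \\
\Longrightarrow\quad e(\tau_{t+1},\theta) &= a\,\theta\,(1-\tau_{t+1})(y_1-y_0).
\end{align*}
```

The objective is strictly concave in $e$, so this stationary point is the unique maximizer, and it is non-negative whenever $\tau_{t+1} \leq 1$.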

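Taking this into account, the tax rate $\tau_{t+1}$ maximizing the expected welfare of lower-class
children at period $t+1$ is given by

$\tau_{t+1}(\pi_1 - \pi_0, \theta) = \mathrm{ArgMax}_{\tau \geq 0}\ (\pi_0 + \theta e(\tau,\theta))(1-\tau) y_1 + (1 - \pi_0 - \theta e(\tau,\theta))(1-\tau) y_0$
$\qquad + \tau \left[ (\pi_0 L_{t+1} + \pi_1 H_{t+1} + \theta e(\tau,\theta))(y_1 - y_0) + y_0 \right] - C(e(\tau,\theta))$
that is:

$\tau_{t+1}(\pi_1 - \pi_0, \theta) = \dfrac{H_{t+1}(\pi_1 - \pi_0)}{a (y_1 - y_0) \theta^2}$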

Unsurprisingly, the socially-optimal tax rate is an increasing function of $(\pi_1 - \pi_0)$ and a
decreasing function of $\theta$: the larger the inequality of opportunity $\pi_1 - \pi_0$, the more it needs
to be corrected, and the higher the income elasticity $\theta$ with respect to effort, the more severe
the incentive problem^23. Note that these properties do not depend on the particular social
welfare function that we chose for illustrative purposes^24.
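As a sanity check on the closed form, the following Python sketch evaluates the lower-class welfare objective on a grid of tax rates and compares the maximizer with $H_{t+1}(\pi_1-\pi_0)/a(y_1-y_0)\theta^2$. The function names and the numerical values in `PARAMS` are illustrative assumptions, not taken from the paper; they are chosen so that all probabilities stay between 0 and 1.

```python
# Illustrative sketch only: names and parameter values are assumptions,
# not part of the original paper.

def effort(tau, theta, a, y0, y1):
    """Privately optimal effort e(tau, theta) = a*theta*(1 - tau)*(y1 - y0)."""
    return a * theta * (1.0 - tau) * (y1 - y0)


def lower_class_welfare(tau, pi0, pi1, theta, a, y0, y1, H):
    """Expected welfare of a lower-class child under tax rate tau."""
    e = effort(tau, theta, a, y0, y1)
    L = 1.0 - H
    p_own = pi0 + theta * e                    # own probability of reaching y1
    p_avg = pi0 * L + pi1 * H + theta * e      # population share reaching y1
    net_income = (1.0 - tau) * (p_own * y1 + (1.0 - p_own) * y0)
    transfer = tau * (p_avg * (y1 - y0) + y0)  # lump-sum rebate of tax revenue
    return net_income + transfer - e ** 2 / (2.0 * a)


def optimal_tax_closed_form(pi0, pi1, theta, a, y0, y1, H):
    """Closed form H*(pi1 - pi0) / (a*(y1 - y0)*theta**2); not clipped to [0, 1]."""
    return H * (pi1 - pi0) / (a * (y1 - y0) * theta ** 2)


def optimal_tax_grid(pi0, pi1, theta, a, y0, y1, H, steps=10000):
    """Brute-force maximizer of the welfare objective over tau in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lower_class_welfare(t, pi0, pi1, theta, a, y0, y1, H))


# Hypothetical parameter values.
PARAMS = dict(pi0=0.2, pi1=0.4, theta=1.0, a=0.3, y0=1.0, y1=2.0, H=0.5)
```

Calling `optimal_tax_grid(**PARAMS)` and `optimal_tax_closed_form(**PARAMS)` should give approximately the same rate, and raising `theta` or lowering `pi1 - pi0` lowers the closed-form rate, in line with the comparative statics stated in the text.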

Now,  assume  that  initially  agents  have  different  beliefs  on  society's  structural  parameters 


^23 Note also that no public intervention is required if opportunities are equal, i.e.
$\pi_1 - \pi_0 = 0$ (this is because we assumed no risk aversion), and that more equalization of
opportunities is less costly when the society is richer (i.e. $H_{t+1}$ larger).

^24 In particular the same properties would hold if one maximizes any (weighted-)utilitarian
social welfare function (assuming positive risk aversion, otherwise the optimal
utilitarian tax rate is always 0).
$(\pi_0, \pi_1, \theta)$. That is, all agree that opportunities are to some extent unequal, but some think
that the "deterministic" difference in opportunities $\pi_1 - \pi_0$ is small as compared to the
importance $\theta$ of individual effort in shaping individual achievements, so that they want very
little state intervention so as not to offset individual incentives; whereas some others agree
that incentives are a problem, but think that overall $\theta$ is sufficiently small as compared to $\pi_1 - \pi_0$
(that is, structural and predetermined factors outweigh individual factors) so that the state
can play a substantial role in raising revenue to equalize opportunities without that much
harm.

The  question  we  want  to  investigate  is  the  following:  assume  there  is  some  "true", 
stationary set of parameters $(\pi_0^*, \pi_1^*, \theta^*)$; what happens in the long-run if agents start with
different  beliefs  on  these  parameters?  what  do  the  long-run  voting  patterns  look  like?  what 
role  is  played  by  social  mobility  in  this  learning  and  voting  process? 

To  answer  these  questions,  we  must  first  specify  how  agents  learn  about  society's  mobility 
parameters. Each dynasty $i \in I$ starts at $t=0$ with some prior belief $\mu_{i0}(\cdot)$ defined over the set
of all logically possible $(\pi_0, \pi_1, \theta)$, chooses an effort level $e_{i0}(\mu_{i0}(\cdot), \tau_0)$ given some (arbitrary)
tax rate $\tau_0$ to start with, rationally updates its belief given its income achievement, takes
part in the voting process over $\tau_1$ by supporting what it believes to be the socially-optimal
policy $\tau_{i1}(\mu_{i1}(\cdot))$ given the posterior belief $\mu_{i1}(\cdot)$ ^25, and finally transmits its posterior to its


^25 We assume here that every voter computes the socially-optimal tax rate as if he
thought  that  everybody  shares  his  beliefs;  otherwise  he  should  take  into  account  the  fact 
that  others  may  take  higher  or  lower  effort  levels  than  he  thought  they  should.  However 
assuming  that  he  has  some  beliefs  over  the  distribution  of  beliefs  and  that  he  fully  takes 
into  account  these  effects  would  not  change  anything  essential;  i.e.  agents  believing  in  higher 
Bs  would  still  tend  to  prefer  lower  tax  rates:  if  everybody  observes  a  common  signal  6  of  the 
average  beliefs,  then  the  most-preferred  tax  rate  T(7ri-7ro,6i,0)  as  a  function  of  one's  own 
offspring; and so on...

Note that as far as effort-taking and voting are concerned, only the means of the
probability measures $\mu_{it}(\cdot)$ are relevant: by linearity,

$e_{it}(\tau_t, \mu_{it}) = e(\tau_t, \theta(\mu_{it}))$
$\tau_{it+1}(\mu_{it}) = \tau(\pi_1(\mu_{it}) - \pi_0(\mu_{it}), \theta(\mu_{it}))$
with $\pi_0(\mu_{it}) = \int \pi_0 \, d\mu_{it}$, $\pi_1(\mu_{it}) = \int \pi_1 \, d\mu_{it}$, $\theta(\mu_{it}) = \int \theta \, d\mu_{it}$

This is not the case for the Bayesian updating process, however: the entire belief matters.
Note also that Bayes' rule puts few restrictions on short-run learning from one's own
experience: one's effort level and political attitudes can go in every direction following, say,
an upwardly mobile trajectory, depending on how initial beliefs determine the interpretation
of the event; we shall see that this ambiguity disappears in the long-run (as arbitrary priors
disappear). Finally, note that the voting process is perfectly standard: preferences over tax
rates are single-peaked around the most-preferred tax rates $\tau_{it+1}(\mu_{it})$, and the median of
these rates is elected and becomes $\tau_{t+1}$ ^26.
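One period of this learning-and-voting loop can be sketched in Python as follows. The sketch assumes beliefs with finite support (a simplification made here for illustration; the paper allows general priors over all logically possible parameters), and every function name and number is hypothetical.

```python
from statistics import median


def success_prob(params, parent_high, effort):
    """P(y1 | effort, origin) = pi_origin + theta * effort for one parameter triple."""
    pi0, pi1, theta = params
    return (pi1 if parent_high else pi0) + theta * effort


def bayes_update(belief, parent_high, effort, got_high):
    """Posterior over parameter triples after one income realization.

    belief maps (pi0, pi1, theta) -> probability; finite support is an
    assumption of this sketch, not of the paper.
    """
    weighted = {}
    for params, prior in belief.items():
        p = success_prob(params, parent_high, effort)
        weighted[params] = prior * (p if got_high else 1.0 - p)
    total = sum(weighted.values())
    return {params: w / total for params, w in weighted.items()}


def belief_means(belief):
    """Means (pi0, pi1, theta) of a belief; only these enter effort and voting."""
    return tuple(sum(w * params[k] for params, w in belief.items()) for k in range(3))


def preferred_tax(belief, a, y0, y1, H):
    """Most-preferred tax rate, evaluated at the belief means (by linearity)."""
    pi0, pi1, theta = belief_means(belief)
    return H * (pi1 - pi0) / (a * (y1 - y0) * theta ** 2)


def elected_tax(preferred_rates):
    """The median of the most-preferred rates wins the vote."""
    return median(preferred_rates)
```

With the two-point prior `{(0.2, 0.4, 0.5): 0.5, (0.2, 0.4, 1.0): 0.5}`, a lower-class child who supplies effort 0.3 and reaches the high income shifts weight toward the higher value of theta and therefore prefers a lower tax rate. Other priors can produce the opposite short-run reaction, which is the ambiguity the text points out.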

This  collective  learning  process  is  defined  as  a  sequence  of  independent  single-dynasty 
learning  processes,  except  that  individual  experimentation  is  influenced  by  some  collective 


beliefs  8^  is  equal  to  Hj(7ri-7ro)/a(yi-yo)8^  +(l-8,)/e,  which  is  still  decreasing  in  8^  for  a  given 
8. 

^26 For lack of a better theory of political parties, we thus assume them to be purely
opportunistic. Nothing essential would be changed in the individual learning processes had we
assumed partisan parties with exogenous objective functions or beliefs (in particular
proposition 3 would still hold).
variable (the redistribution rate $\tau_t$) determined by the cross-section distribution of beliefs.
In effect, the learning process we specified above is fully rational if one assumes that each
single dynasty observes only its own economic achievement^27 and knows nothing about
the rest of the system^28. We feel comfortable with this assumption, first because most
agents have access to little hard information beyond their immediate family circle (all
results would survive if some finite groups of dynasties shared common information and
experimentation^29), and mostly because only under extreme assumptions can one learn
substantially more by looking at cross-section aggregates (such as the cross-section
distributions of beliefs, income and mobility rates); we discuss these issues in section 4.

Section  3  :  Steady-State  Political  Attitudes. 

The  first  property  of  this  dynamic  process  of  learning  and  voting  is  that  it  converges;  that 


^27 One can assume either that each dynasty behaves as a single infinite-horizon Bayesian
agent (parents "transmit" their posterior to their offspring in the same way as a single agent
uses last period's posterior as his new prior), or that each new generation starts its learning
life with its own prior and observes the entire mobility history of its family, or observes
only its parents' posterior and knows that they are Bayesian (this is equivalent).

^28 That is, they vote for the redistribution they prefer when given the choice, but they
don't know where the equilibrium redistribution comes from and don't try to assess its
informativeness (we show in section 4 that the latter is very limited, anyway).

^29 The point is that finite groups of dynasties observing each other's experimentation will
tend to experiment the same way (i.e. to take the same effort level) given time preference.
A given experimentation would still give more information, but this would only change the
size of the perturbations required to remove some wrong belief, without affecting the set of
stable beliefs defined in the next section; as the size of the groups goes to infinity, infinitely small
perturbations remove every belief except the correct one.
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is,  in  the  long-run  beliefs  about  society's  mobility  parameters  and  the  resulting  equilibrium 

tax  rate  are  stationary^.  This  is  a  direct  consequence  of  the  martingale  convergence 

theorem. 

Proposition 1. For every dynasty $i \in I$, the belief $\mu_{it}(\cdot)$ converges with probability 1 toward
some stationary belief $\mu_{i\infty}(\cdot)$ as $t$ goes to $\infty$. The equilibrium tax rate $\tau_t$ converges toward
some tax rate $\tau_\infty$.

Proof. For any given sequence of tax rates $(\tau_t)_{t \geq 0}$, the stochastic process $(\mu_{it}(\cdot))_{t \geq 0}$ is defined
by a standard, fully-rational process of bayesian updating, and as such has the martingale
property (see, e.g., Aghion et al.(1991)). Thus the martingale convergence theorem applies,
and the society converges toward some stationary set of beliefs $(\mu_{i\infty}(\cdot))_{i \in I}$. It follows that the
equilibrium tax rate, as a continuous function of these beliefs, converges. CQFD.

Now, the interesting question is whether every dynasty necessarily adopts the same belief
in steady-state, and whether the long-run tax rate is necessarily equal to the "true"
socially-optimal tax rate. Those readers who are familiar with Rothschild(1974)'s two-armed bandit
problem shouldn't be too surprised that the answer to both questions is no^31: learning
about the relative importance of luck and individual effort in shaping individual
achievements is just as complicated as learning the average payoff of a multi-armed
bandit's arms. Indeed, knowing how one's social achievement (as a function of social
origins) responds to variations in individual (lifetime) effort would require a lot of costly
experimentation; namely, several generations would have to "sacrifice" their life by trying
to supply no effort at all or to work like mad in order to see what happens to their
socio-economic status! Therefore in general not everything is learned in the long-run, and initial
beliefs and social mobility trajectories play a key role in shaping actual long-run beliefs and
political attitudes.

^30 Obviously, this would not be true if society's mobility parameters were not stationary,
which may well be the case in practice. We leave this for future research.

^31 Each effort level is an arm of this continuous-armed bandit problem (note that we
make learning much easier by assuming a linear functional form between effort and
mobility probabilities; see section 4).

Although it is impossible to derive analytically the mapping from initial beliefs $(\mu_{i0}(\cdot))_{i \in I}$
to long-run beliefs $(\mu_{i\infty}(\cdot))_{i \in I}$, we are able to say which long-run beliefs $(\mu_{i\infty}(\cdot))_{i \in I}$ are more
likely to be observed than others by appealing to some stability criterion. Indeed if we just
define an "observable" steady-state as a set of beliefs reproducing itself, then (almost)
anything is "observable", and in particular any set of point-beliefs: by definition of bayesian
updating, one can never learn what was ruled out by the prior. However some of these beliefs
are much less likely to be observed than others: typically, beliefs generating expected
mobility probabilities that are different from the actual mobility frequencies are very
unstable, and conversely. That is, we define $A(\tau)$ as the set of $(\pi_0,\pi_1,\theta)$ such that the optimal
effort level $e(\tau,\theta)$ associated with the point-belief $1_{(\pi_0,\pi_1,\theta)}$ and the tax rate $\tau$ generates a
statistical distribution of high and low incomes identical to that expected by an agent with
prior $1_{(\pi_0,\pi_1,\theta)}$; that is, $A(\tau)$ is the set of all $(\pi_0,\pi_1,\theta)$ such that

$$\pi_0 + \theta e(\tau,\theta) = \pi_0^* + \theta^* e(\tau,\theta)$$
$$\pi_1 + \theta e(\tau,\theta) = \pi_1^* + \theta^* e(\tau,\theta)$$

That is:

$$A(\tau) = \{ (\pi_0(\theta),\ \pi_1(\theta) = \pi_1^* - \pi_0^* + \pi_0(\theta),\ \theta) \}_{\theta \geq 0}$$
with $\pi_0(\theta) = \pi_0^* + (\theta^* - \theta)\, e(\tau,\theta)$

[Figure 1 represents the locus $A(\tau)$]
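
As a purely numerical illustration of this definition, the following Python sketch traces a few points of the locus and checks that, on the locus, the mobility probabilities expected by the agent coincide with those actually generated by his effort. The effort rule `effort(tau, theta) = a * theta * (1 - tau) * (y1 - y0)`, all parameter values and all function names are assumptions made for the example only; the definition above only requires some optimal effort level e(tau, theta).

```python
# Illustrative sketch: points of the locus A(tau) under an ASSUMED effort rule.
# All parameter values below are hypothetical.

PI0_STAR, PI1_STAR, THETA_STAR = 0.2, 0.5, 0.6   # hypothetical "true" parameters
A_COST, Y0, Y1 = 0.5, 1.0, 2.0                   # hypothetical cost scale and incomes


def effort(tau, theta):
    """Assumed effort rule: increasing in theta, decreasing in tau."""
    return A_COST * theta * (1.0 - tau) * (Y1 - Y0)


def locus_point(tau, theta):
    """Belief (pi0, pi1, theta) lying on A(tau) for a given believed theta."""
    e = effort(tau, theta)
    pi0 = PI0_STAR + (THETA_STAR - theta) * e
    pi1 = PI1_STAR - PI0_STAR + pi0
    return pi0, pi1, theta


def expected_probs(belief, tau):
    """High-income probabilities (low origin, high origin) expected under the belief."""
    pi0, pi1, theta = belief
    e = effort(tau, theta)
    return pi0 + theta * e, pi1 + theta * e


def true_probs(belief, tau):
    """Actual high-income probabilities generated by the effort the belief induces."""
    e = effort(tau, belief[2])
    return PI0_STAR + THETA_STAR * e, PI1_STAR + THETA_STAR * e


if __name__ == "__main__":
    tau = 0.3
    for theta in (0.2, 0.6, 1.0):
        b = locus_point(tau, theta)
        print(theta, b, expected_probs(b, tau), true_probs(b, tau))
```

Under these assumptions, a dynasty believing in a high theta and a correspondingly low pi0 observes exactly the mobility frequencies it expects, which is why such a belief is never contradicted by its own experience.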

Intuitively, if a belief $\mu_{i\infty}(\cdot)$ and a tax rate $\tau_\infty$ lead to an effort level $e_{i\infty}$ such that the
statistical distribution of upwardly- and downwardly-mobile trajectories does not
coincide with that expected by the agent, any small deviation from $\mu_{i\infty}(\cdot)$ in any direction
that gets it closer to the true distribution will be recognized by the agent (with some positive
probability); conversely, if the "true" statistical distribution and the expected distribution
do coincide most perturbations won't be recognized; the point is that there are many beliefs
generating actions such that the expected distribution and the statistical distribution
coincide (see figure 1): it is difficult to realize that one puts too much weight on effort if
one puts at the same time too little weight on predetermined factors. We say that a
steady-state is stable if it consists of beliefs that cannot be removed by all small deviations in the
direction of either lower or higher mobility than previously expected (see the appendix)^32;
this implies that stable, stationary beliefs must have their averages on $A(\tau_\infty)$.

^32 One may want to define stability as the property that any individual deviation in some
sufficiently close neighborhood of one's stationary belief converges toward the initial
stationary belief with probability 1. However this is too demanding: no steady-state is stable
according to this definition (not even the "true" belief $1_{(\pi_0^*,\pi_1^*,\theta^*)}$). This comes from
topological problems similar to those met by Aghion et al.(1991, pp.636-637): "closeness of
beliefs" in the topological sense allows for too many deviations.

Proposition 2. $((\mu_{i\infty}(\cdot))_{i \in I}, \tau_\infty)$ is a stable steady-state iff

(1) $\forall i \in I$, $(\pi_0(\mu_{i\infty}), \pi_1(\mu_{i\infty}), \theta(\mu_{i\infty})) \in A(\tau_\infty)$

(2) $\tau_\infty$ is the median of $(\tau_{i\infty})_{i \in I}$

Proof: see the appendix.

What proposition 2 tells us is that for any stable steady-state all dynasties can be ranked
along a one-dimensional scale, namely, their position on the curve $A(\tau_\infty)$. That is, in the
long-run all dynasties believe that the "pre-determined" opportunity difference $\pi_1 - \pi_0$
between lower-class and middle-class children is (on average) $\pi_1^* - \pi_0^*$ (the "true"
opportunity difference), but they have different estimates of $\theta$, i.e. of how much individual
effort can undo the effects of social rigidities. All dynasties are mobile, so that one can find
proponents of all redistributive policies in every income group. But the point is that because
the same beliefs lead some dynasties to supply less effort and to support more
redistribution, in steady-state there are more left-wing voters among the lower-class
(irrespective of who has got the right belief, if any), and the political composition of socially
mobile agents is strictly intermediate between those of the stable.

To see that, note that those dynasties $i \in I$ who have converged toward a higher
$\theta = \theta(\mu_{i\infty})$ vote for less redistributive policies ($\tau(\pi_1 - \pi_0, \theta)$ is decreasing with $\theta$) and supply
more effort ($e(\tau_\infty,\theta)$ is increasing with $\theta$) so that a higher fraction of them $H_\infty(\theta)$ has a
high-income in steady-state; indeed, $H_\infty(\theta)$ is given by the condition that the mass going out
of the high-income class is equal to the mass coming in:

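$$(\pi_1^* + \theta^* e(\tau_\infty,\theta))\, H_\infty(\theta) + (\pi_0^* + \theta^* e(\tau_\infty,\theta))\, L_\infty(\theta) = (1 - \pi_1^* - \theta^* e(\tau_\infty,\theta))\, H_\infty(\theta)$$
that is: $$H_\infty(\theta) = \frac{1 - 2(\pi_1^* - \pi_0^*)}{1 - 2\pi_1^* + \pi_0^* - \theta^* e(\tau_\infty,\theta)} - 1$$
so that $H_\infty'(\theta) > 0$ (as long as $H_\infty(\theta) < 1$)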

It follows that a higher fraction of the supporters of lower tax rates has a high income. In the
same way, supporters of lower tax rates have a higher probability of being upwardly-mobile
than stable in the lower class, but a lower probability of being downwardly-mobile than stable
in the middle class. Indeed the steady-state fractions of $\theta$-dynasties who are upwardly
mobile $UM_\infty(\theta)$, downwardly-mobile $DM_\infty(\theta)$, stable at high-income $SH_\infty(\theta)$ and stable at
low-income $SL_\infty(\theta)$ are given by

$$UM_\infty(\theta) = (\pi_0^* + \theta^* e(\tau_\infty,\theta))\, L_\infty(\theta)$$
$$DM_\infty(\theta) = (1 - \pi_1^* - \theta^* e(\tau_\infty,\theta))\, H_\infty(\theta)$$
$$SH_\infty(\theta) = (\pi_1^* + \theta^* e(\tau_\infty,\theta))\, H_\infty(\theta)$$
$$SL_\infty(\theta) = (1 - \pi_0^* - \theta^* e(\tau_\infty,\theta))\, L_\infty(\theta)$$

It follows that the fraction of $\theta$-dynasties who are mobile as compared to the fraction of
$\theta$-dynasties who are stable at high-income (resp. low-income) decreases (resp. increases) with
$\theta$. Therefore the mobile as a whole have political orientations which are intermediate
between those of the stable.
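
The comparative statics of these ratios can be checked numerically. The sketch below uses only the four expressions given above: in the ratios DM/SH and UM/SL the class sizes H and L cancel out, so only the transition probabilities matter. The effort rule, the parameter values and the function names are assumptions made for the illustration, not part of the model's specification.

```python
# Illustrative sketch: mobile-to-stable ratios as functions of the believed theta.
# The effort rule and all numbers are hypothetical.

PI0_STAR, PI1_STAR, THETA_STAR = 0.2, 0.5, 0.6   # hypothetical "true" parameters
A_COST, Y0, Y1, TAU_INF = 0.5, 1.0, 2.0, 0.3     # hypothetical cost scale, incomes, tax rate


def effort(tau, theta):
    """Assumed effort rule: increasing in theta, decreasing in tau."""
    return A_COST * theta * (1.0 - tau) * (Y1 - Y0)


def mobile_to_stable_ratios(theta):
    """Return (DM/SH, UM/SL) for theta-dynasties; H and L cancel in each ratio."""
    e = effort(TAU_INF, theta)
    p_up = PI0_STAR + THETA_STAR * e     # high income next period, low-income origin
    p_stay = PI1_STAR + THETA_STAR * e   # high income next period, high-income origin
    dm_over_sh = (1.0 - p_stay) / p_stay
    um_over_sl = p_up / (1.0 - p_up)
    return dm_over_sh, um_over_sl


if __name__ == "__main__":
    for theta in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(theta, mobile_to_stable_ratios(theta))
```

With these hypothetical numbers, DM/SH falls and UM/SL rises as theta increases, which is the pattern used in the argument above.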


Proposition 3. In any stable steady-state, the voting patterns mimic those presented in
table 1. That is, for any two redistributive policies $\tau, \tau'$, with $\tau > \tau'$,

$$H_\infty(\tau,\tau') < L_\infty(\tau,\tau')$$
$$SH_\infty(\tau,\tau') < UM_\infty(\tau,\tau'),\ DM_\infty(\tau,\tau') < SL_\infty(\tau,\tau')$$

where $X_\infty(\tau,\tau')$ is the fraction of class $X$ preferring $\tau$ to $\tau'$.

Proof. Because preferences over tax rates are single-peaked, there exists $\tau''$, with $\tau > \tau'' > \tau'$,
such that dynasties $i \in I$ preferring $\tau$ to $\tau'$ are those whose most-preferred tax rate $\tau(\theta(\mu_{i\infty}))$
is above $\tau''$, i.e. those whose $\theta(\mu_{i\infty})$ is below some $\theta''$. Since the fraction of $\theta$-dynasties
$H_\infty(\theta)$ obtaining a high-income in steady-state increases with $\theta$, the fraction of the
high-income class whose $\theta(\mu_{i\infty})$ is below some $\theta''$ is lower than that of the low-income class.
Similarly, because $SH_\infty(\theta)/DM_\infty(\theta)$ and $SH_\infty(\theta)/UM_\infty(\theta)$ increase with $\theta$,
$SH_\infty(\tau,\tau') < UM_\infty(\tau,\tau')$ and $SH_\infty(\tau,\tau') < DM_\infty(\tau,\tau')$, and conversely with $SL_\infty$. CQFD.

Thus in the long-run social origins have an effect on political attitudes only
because they are informative about which type of dynasty one belongs to. Prior to convergence
however, one cannot completely distinguish between the indirect and the direct effect: many
lower-class agents are in the lower-class because their ideology does not push them to work
hard to be promoted, but also their poor economic performance confirms their initial
ideology (and conversely for the right-wing ideology).

Section  4  :  Robustness  of  Long-Run  Heterogeneity. 

We now address the issue of learning by looking at cross-section aggregates and show that
this constrains in interesting ways the steady-states without affecting the flavour of the
main results (i.e. long-run beliefs heterogeneity and proposition 3)^33.

Consider first how much agents can learn by observing the cross-section distribution of
beliefs. If they can observe the exact beliefs $\mu_{i\infty}(\cdot)$ of other dynasties, then in "steady-state"
they can infer the true $(\pi_0^*,\pi_1^*,\theta^*)$; this is because the curve $A(\tau_\infty)$ of stationary beliefs
depends on the true $(\pi_0^*,\pi_1^*,\theta^*)$: some dynasty $i$ believing in $\mu_{i\infty}(\cdot)$ is ready to accept that
some other dynasties have some "wrong" stationary beliefs, but they must be on the curve
$A_{(\pi_0(\mu_{i\infty}),\pi_1(\mu_{i\infty}),\theta(\mu_{i\infty}))}(\tau_\infty)$ defined by replacing $(\pi_0^*,\pi_1^*,\theta^*)$ by $(\pi_0(\mu_{i\infty}),\pi_1(\mu_{i\infty}),\theta(\mu_{i\infty}))$ in the
definition above^34; since these two curves do not coincide dynasty $i$ should realize that
$\mu_{i\infty}(\cdot)$ can't be the right belief (see figure 2). Thus assuming that agents observe where the
cross-section distribution of beliefs is exactly located in the space of beliefs implies that the
only stable steady-state involves everybody learning the true $(\pi_0^*,\pi_1^*,\theta^*)$.

^33 We discuss these issues without doing any kind of cost/benefit analysis of information
acquisition. The lack of consensus about these issues among those who spend their life
studying them (see section 1) suggests that the costs of acquiring the relevant pieces of
information are quite high, although the collective benefits may be high.

^34 Obviously, this is assuming that each agent is able to write and solve the model as we
did in the last section, which is quite demanding. Otherwise, there is not much to be said.

However this inference relies on the unrealistic assumption that agents can observe the
exact probability measure $\mu_{i\infty}(\cdot)$ of other agents; in practice, the only "material" expression
of a belief is an effort level $e_{i\infty}(\mu_{i\infty})$ and a most-preferred tax rate $\tau_{i\infty}(\mu_{i\infty})$, and an agent can
observe at most the cross-section distribution of these choice variables. These are very
insufficient statistics for the entire beliefs, and one can easily see from figure 2 that a
rational bayesian updater can't learn much by observing these cross-section distributions:
the only steady-states that are ruled out are those involving extreme left-wing agents and
extreme right-wing agents at the same time. This is because self-consistent, fully-rational
left-wing dynasties believing in a low $\theta(\mu_{i\infty})$ believe that the maximum steady-state "mistake"
is $\theta_{max}(\theta(\mu_{i\infty}))$, which is lower than the "true maximum mistake" $\theta_{max}(\theta^*)$ (see figure 2); that
is, if these dynasties observe too right-wing dynasties, they must infer that something is
going wrong and revise these beliefs^35. Therefore any steady-state must be such that the
most left-wing beliefs $\mu_{i\infty}$ and the most right-wing beliefs $\mu_{j\infty}$ are mutually compatible, i.e. are
such that $\theta_{max}(\theta(\mu_{i\infty})) \geq \theta(\mu_{j\infty})$, which limits slightly the set of steady-states defined in
proposition 2.

The other inference that fully-rational agents could make by looking at the cross-section
distribution of beliefs (as expressed by, say, the most-preferred tax rates) would be to think
about what the initial distribution of priors was when this whole process started long ago in
the past, to compute what steady-state distribution of beliefs this implies as a function of
the true $(\pi_0^*,\pi_1^*,\theta^*)$, and then to try to invert this mapping. This strikes us as an extreme
implication of bayesian rationality (in the history of inequality and redistribution the notion
of a time 0 is rather elusive, and for sure real-world agents don't perform this thought
experiment), but in any case this would not result in adequate learning. The reason is
that this mapping has no reason to be invertible: the dimension of the set of long-run
distributions of observed actions has no reason in general to be larger than the dimension
of the set of unknown parameters^36, so that observing the former gives limited information
on the latter. Therefore such an inference process could reduce somewhat the set of
"admissible" steady states^37 but would not change the basic result of long-run heterogeneity
and voting patterns^38.

^35 Note that this reasoning does not hold for extreme right-wing dynasties, who can
move on with their extreme beliefs without ever worrying about the extreme left-wing people
in the other corner!

Consider now what agents could learn by observing both the cross-section distribution of
actions and the cross-section income distribution and mobility rates. Assume first that they
observe the steady-state distribution of most-preferred tax-rates $(\tau_{i\infty}(\mu_{i\infty}))_{i \in I}$ (or of effort
levels $(e_{i\infty}(\mu_{i\infty}))_{i \in I}$) and the steady-state distribution of income $(L_\infty, H_\infty = 1 - L_\infty)$. By observing
the distribution of actions they can compute the average effort level $e_\infty = \int e_{i\infty}(\mu_{i\infty})\, di$, and
combined with the knowledge of $H_\infty$ this gives to every agent the true knowledge of the
probability of obtaining a high income with effort $e_\infty$: the true $(\pi_0^*,\pi_1^*,\theta^*)$ must verify
$\pi_0^* L_\infty + \pi_1^* H_\infty + \theta^* e_\infty = H_\infty$. Combined with one's own beliefs $\mu_{i\infty}$ this allows every agent to infer
the true $(\pi_0^*,\pi_1^*,\theta^*)$.

^36 In our model, the set of all logically possible distributions of actions does have higher
dimensionality than the set of possible parameters, but this is an artifact of the two-income
modelling and of the linearity of the mobility process: in general the set of all parameters
required to describe how the entire mobility matrix responds to effort is very likely to be
much higher-dimensional than the observable part of the distribution of actions.

^37 Typically, this could again rule out steady-states with too extreme beliefs on both
sides, since this inference process gives the same information (if any) to everybody which
implies a relative "leveling" of all beliefs.

^38 This confirms the finding by Smith(1991) that this kind of inference from observing
others' actions can result in adequate learning only under extreme assumptions: although
Smith makes the adequate dimensionality assumptions to avoid invertibility problems,
adequate learning occurs only if the distribution of priors contains unboundedly informative
priors (but convergence toward a common belief is guaranteed).
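
The accounting identity used in this paragraph can be illustrated numerically. The sketch below assumes a finite number of equally weighted dynasties, each supplying a fixed and hypothetical effort level; it iterates each dynasty's two-state income chain to its stationary distribution and compares the aggregate high-income share with the right-hand expression built from the average effort. All numerical values and function names are assumptions of the example.

```python
# Illustrative sketch: aggregate stationarity identity
#   pi0* L + pi1* H + theta* e_bar = H
# for equally weighted dynasties with fixed (hypothetical) effort levels.

PI0_STAR, PI1_STAR, THETA_STAR = 0.2, 0.5, 0.6   # hypothetical "true" parameters


def stationary_high_share(e, n_iter=500):
    """Iterate the two-state income chain of a dynasty supplying constant effort e."""
    p_up = PI0_STAR + THETA_STAR * e      # low-income origin -> high income
    p_stay = PI1_STAR + THETA_STAR * e    # high-income origin -> high income
    h = 0.5                               # arbitrary starting share
    for _ in range(n_iter):
        h = p_stay * h + p_up * (1.0 - h)
    return h


def aggregate_check(efforts):
    """Return (H, pi0* L + pi1* H + theta* e_bar) for equally weighted dynasties."""
    shares = [stationary_high_share(e) for e in efforts]
    h_bar = sum(shares) / len(shares)
    e_bar = sum(efforts) / len(efforts)
    rhs = PI0_STAR * (1.0 - h_bar) + PI1_STAR * h_bar + THETA_STAR * e_bar
    return h_bar, rhs


if __name__ == "__main__":
    print(aggregate_check([0.07, 0.21, 0.35]))
```

The two numbers coincide because the identity is linear in effort, so knowing the income distribution and the average effort pins down one linear restriction on the three true parameters.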

Again, such a successful inference relies on very extreme informational assumptions: that
is, the aggregate distribution of effort levels per se is certainly not observable^39, and the
distribution of most-preferred tax rates is observable only to the extent that they materialize
into actual support for some political party. Assume for example that one can only observe
the eventual equilibrium tax rate $\tau_\infty$ and that the latter is known to be the median of the
distribution of most-preferred tax rates^40. Then going from this observation to the
knowledge of the average effort level $e_\infty$ is literally impossible: distributions with a fixed
median can have all sorts of averages, plus the relation between effort levels and
most-preferred tax rates is non-linear.
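
A two-line numerical example makes the point about medians and averages concrete. The two cross-sections of most-preferred tax rates below are entirely hypothetical; they share the same median (hence the same equilibrium tax rate under majority voting) but have very different averages.

```python
# Illustrative sketch: same median, different means (hypothetical data).
from statistics import mean, median

rates_a = [0.10, 0.20, 0.30, 0.40, 0.50]   # hypothetical most-preferred tax rates
rates_b = [0.28, 0.29, 0.30, 0.80, 0.90]   # another hypothetical cross-section


def summarize(rates):
    """Return (median, mean) of a cross-section of most-preferred tax rates."""
    return median(rates), mean(rates)


if __name__ == "__main__":
    print(summarize(rates_a))
    print(summarize(rates_b))
```

An observer who only sees the common median cannot tell these two societies apart, and a fortiori cannot recover the average effort level behind them.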

Assuming that one can observe the relative popularity of two exogenously-given political
parties or the actual mobility rates would not alter the basic message: by observing
aggregate characteristics of the collective learning process one can get at most an
approximate knowledge of the "aggregate experiment"; this can possibly allow some limited
inference, but in any case extreme assumptions are required to obtain adequate learning.

^39 Others' effort levels are possibly observable for small, finite groups of agents, which
would not change anything to the set of stable steady-states (see footnote 29).

^40 Which in practice is quite speculative.

In sum, even assuming that agents have very sophisticated cognitive ability (they know the
right model and are able to compute its dynamic properties), no realistic assumption
regarding what agents can observe affects significantly the analysis of the collective learning
process: inferring information from looking at cross-section aggregates can typically rule
out steady-states with too extreme beliefs and therefore implies a relative "homogenisation"
of steady-states^41 but cannot remove the long-run heterogeneity of beliefs.


Section  5  :  Some  Welfare  Analysis. 

Now consider an outside observer knowing the model and looking at the pieces of
international evidence that we have on inequality, mobility and redistribution in western
democracies. Assume also that this outside observer is ready to assume that these countries
have the same structural parameters $(\pi_0^*,\pi_1^*,\theta^*)$. The first piece of international evidence
is that important and fairly stable differences in levels of redistribution are being observed:
typically, redistributive transfers tend to be much smaller in the United States than in
Western Europe and especially Scandinavia^42. From this one can infer that these countries
are in different steady-state equilibria of the model (this is confirmed by the observation
that working hours, i.e. some limited signal of effort, tend to be longer in the US). Of
course, if the outside observer looking at these countries knows the true parameters, he can
easily say which country redistributes too much, which country works too much, which
agents have a "wrong" ideology, and so on: he knows that the "truly optimal" rate of
redistribution $\tau^*$, effort level $e^*$ and GNP $L^* y_0 + H^* y_1$ are given by the true parameters
$(\pi_0^*,\pi_1^*,\theta^*)$.

^41 The observation of a common signal always has a "levelling effect" on a set of
heterogeneous beliefs. Note that there is another reason why too different beliefs cannot
be sustained in steady-state, which operates through the influence of the equilibrium
redistribution on individual experimentations: for example in a country with a tradition of
a low $\tau$ supplying little effort is more costly (for a given left-wing belief) and therefore it
is harder to learn about a possible low $\theta^*$. We believe these effects explain partly the
political homogeneity of countries like the US and why the spectrum of political attitudes
overlaps so little between both sides of the Atlantic.

But if the outside observer does not know a priori which beliefs are the right ones (just
like us), what can he say if he wants to compare the actual welfare of these different
dynasties and countries? The answer may first seem to be: not much. Indeed, one can find
steady-states where the agents spending the highest amount of effort are in fact not
working enough (given the true returns to effort), and others where the agents spending the
lowest amount of effort work too much. Maybe there is too much redistribution in the US,
and maybe there is too little in Sweden.

In a desperate need to refine his beliefs, the observer may compare the GNPs of these
different countries: the theory predicts that a country with less redistribution should have
a higher GNP (whatever the true parameters), and that this should be all the more so if the
incentive problem is more severe (that is, if the true social optimum is relatively little
redistribution). Here the evidence is not very conclusive: EC countries tend to have a
somewhat lower GNP/capita than the US, but this is not so for Scandinavia. Coming down
to less and less secure grounds, the observer may want to compare mobility rates: the theory
again predicts that countries with less redistributive taxation should have higher mobility
rates, and again this should show up particularly strongly if individual incentives play the
key role postulated by these countries. The striking observation here is that all quantitative
studies that have tried to compare mobility rates across developed countries have concluded
that these rates were amazingly similar (see the references given in section 1). The observer
may choose to conclude that since the rigid, redistribution-intensive societies of Western
Europe are as mobile as the US, there is little reason to believe in such a strong need to
preserve individual incentives. This is a very insecure inference process, but this may be
the best one can do to refine arbitrary priors, and we believe this is the kind of
international comparison on which a number of observers "decide" on which side of the
Atlantic we are closer to the social optimum.

^42 It is hard to give a global quantification of this multi-dimensional phenomenon.
Mueller(1989, p.326) presents some data showing that the size of transfers as a fraction of
GNP is twice as large in Western Europe as in the US.

In theory, one can say more than that by looking in more detail at the class composition
of the electorates supporting different redistributive policies (say, different parties). For
example, if there is a lot of class polarisation (i.e. very high partisan voting in each social
class), this suggests that (at least) one class is very far from its socially-optimal welfare
level. In the same way, very different most-preferred policies (i.e. main political parties
advocating very different rates) suggest that (at least) some dynasties have got it all wrong.
Assume now we observe very different policy proposals, but very little class polarisation.
This suggests that very different effort levels do not have a major effect on individual
achievements, and therefore that the truly socially-optimal policy involves a lot of
redistribution and that those working the most should slow down. Similarly, substantial
class polarisation around comparable policy proposals indicates that individual factors are the
key to success and that the social optimum involves little redistribution. This analysis of
class polarisation of electorates vs polarisation of the political spectrum can also be
conducted at the cross-country level. One would have to look at this data in more detail,
but there does not seem to be any striking dissimilarity across western countries from which
information could be inferred. In any case, these are again very approximate ways to infer
some information, but these may be the best ones available given what we want to learn.


Section  6  :  Concluding  Comments. 

This paper has two main objectives. First, providing some theoretical foundations to
understand better the political economy of redistribution and particularly some important
stylised facts concerning the effect of social mobility on political attitudes toward
redistribution (namely, the fact that voters with identical incomes but different social
origins vote differently). This gives a richer picture of redistributive politics than the
standard median-voter model (which cannot account for these stylised facts). We believe that
our theory also provides a tractable framework to analyze the fluctuations of redistributive
politics, e.g. one that can be used to look at the effects of changes in the pre-tax distribution
on redistributive policies (for example, we have not analyzed how shocks to fundamentals
determine transitions between steady-states).

Next, this paper suggests that instead of always looking at politics as a game of
conflicting-interests aggregation, it may sometimes be valuable to consider that the main
difference between voters is not their differing interests and objective functions but rather
the information and ideas about policies that they have been exposed to during their social
life. Not only is the majority rule ill-suited to aggregate conflicting interests (see, e.g.,
majority cycles), but differing beliefs and ideas about government intervention in the
economy are pervasive (not only among economists). The point is that although people can
have different beliefs about the best-possible policy, these beliefs are not arbitrary: agents
are naturally exposed to different pieces of information depending on their economic
position. We hope this general approach can be tractable and rewarding enough to solve
interesting political-economy questions in the future^43.


Appendix. 

Proof  of  proposition  2. 

We first define formally the notion of stability that we're using (we restrict formal notations
to the case of single-point beliefs (Dirac measures), but everything can be readily extended
to non-single-point stationary beliefs by replacing them by their averages).

^43 For example, consider the unemployment model with voting over firing costs of
Saint-Paul(1993): unlike in Saint-Paul's theory, it may well be that there exist some
socially-optimal, positive firing costs depending on how much employers internalize the
human-capital social costs of firing; in such a case, it may be reasonable to expect that various
employment histories lead to various informational exposures regarding employers'
"excessive" propensity to fire, leading to different political attitudes and possibly important
positive and normative implications.

Consider some stationary belief $1_{(\pi_0,\pi_1,\theta)}$ and some stationary tax rate $\tau$, leading to some
effort level $e(\tau,\theta)$ and to some expected mobility probabilities $(\pi_0 + \theta e(\tau,\theta), \pi_1 + \theta e(\tau,\theta))$.
Assume $(\pi_0,\pi_1,\theta)$ is not on $A(\tau)$. Then we prove that any (arbitrarily) small perturbation in
the direction of $A(\tau)$ will remove this belief with positive probability (see figure 1). In that
sense only beliefs located on $A(\tau)$ can form stable steady-states, in the sense that constantly
perturbed beliefs always tend to come back there.

Assume for example that $(\pi_0,\pi_1,\theta)$ is above $A(\tau)$ (see figure 1), i.e. that
$\pi_0 + \theta e(\tau,\theta) > \pi_0^* + \theta^* e(\tau,\theta)$ (the expected upward mobility probability is higher than the
true probability). Consider any other parameters $(\pi_0',\pi_1',\theta')$ predicting a lower probability
of upward mobility than $(\pi_0,\pi_1,\theta)$ (for example the parameters of $A(\tau)$; see figure 1), i.e.
such that

$$\pi_0 + \theta e(\tau,\theta) > \pi_0' + \theta' e(\tau,\theta)$$

Consider the dynastic learning process starting with beliefs $\mu_{i0} = (1 - z_0)\, 1_{(\pi_0,\pi_1,\theta)} + z_0\, 1_{(\pi_0',\pi_1',\theta')}$,
where $z_0$ is some arbitrarily small number. We prove that the long-run beliefs cannot be
$\mu_{i\infty} = 1_{(\pi_0,\pi_1,\theta)}$ with probability 1. We note $e(z) = e(\tau, (1 - z)\theta + z\theta')$. Then transition rules for
$(z_t)_{t \geq 0}$ are given by

$$z_{t+1}^{+} = \frac{\pi_0' + \theta' e(z_t)}{(1 - z_t)(\pi_0 + \theta e(z_t)) + z_t(\pi_0' + \theta' e(z_t))}\, z_t$$
if dynasty $i$ observes upward mobility

$$z_{t+1}^{-} = \frac{1 - \pi_0' - \theta' e(z_t)}{(1 - z_t)(1 - \pi_0 - \theta e(z_t)) + z_t(1 - \pi_0' - \theta' e(z_t))}\, z_t$$
if dynasty $i$ observes stability at low income

Since $\pi_0 + \theta e(\tau,\theta) > \pi_0' + \theta' e(\tau,\theta)$, $z_{t+1}^{+} < z_t < z_{t+1}^{-}$. For $z_t$ sufficiently small,
$(1 - z_t)(\pi_0 + \theta e(z_t)) + z_t(\pi_0' + \theta' e(z_t)) > \pi_0^* + \theta^* e(z_t)$ (by continuity). It follows that for $z_t$
sufficiently small, $E z_{t+1} = (\pi_0^* + \theta^* e(z_t))\, z_{t+1}^{+} + (1 - \pi_0^* - \theta^* e(z_t))\, z_{t+1}^{-} > z_t$ ($E z_{t+1}$ is the "true"
expectation of $z_{t+1}$, as opposed to $E(z_{t+1} | z_t)$ which by definition is equal to $z_t$). Finally,
$E z_{t+1} > z_t$ for $z_t$ sufficiently small implies that $E z_t$ converges to a strictly positive limit, and
therefore that $z_t$ cannot converge to 0 with probability 1. One can prove in the same way
the instability of any $(\pi_0,\pi_1,\theta)$ located below $A(\tau)$. CQFD.
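
The key step of this proof, namely that the weight on the perturbing belief is a martingale under the agent's own expectation but drifts upward under the true mobility probabilities, can be checked numerically. The following Python sketch implements one step of the bayesian update for a low-income dynasty. The effort rule, every parameter value and every function name are assumptions made for the illustration; the held belief is chosen so that it predicts more upward mobility than both the truth and the perturbing belief.

```python
# Illustrative sketch: one-step bayesian update of the weight z on a perturbing belief.
# The effort rule and all numbers are hypothetical.

PI0, THETA = 0.30, 0.8             # held belief (predicts too much upward mobility)
PI0_ALT, THETA_ALT = 0.25, 0.7     # perturbing belief (predicts less upward mobility)
PI0_STAR, THETA_STAR = 0.2, 0.6    # hypothetical "true" parameters
A_COST, Y0, Y1, TAU = 0.5, 1.0, 2.0, 0.3


def effort_at(z):
    """Assumed effort e(z) = e(tau, (1 - z) * theta + z * theta')."""
    theta_mix = (1.0 - z) * THETA + z * THETA_ALT
    return A_COST * theta_mix * (1.0 - TAU) * (Y1 - Y0)


def posterior_weights(z):
    """Return (z after upward mobility, z after low-income stability, subjective p_up)."""
    e = effort_at(z)
    p_held = PI0 + THETA * e
    p_alt = PI0_ALT + THETA_ALT * e
    p_subj = (1.0 - z) * p_held + z * p_alt
    z_up = p_alt / p_subj * z
    z_low = (1.0 - p_alt) / (1.0 - p_subj) * z
    return z_up, z_low, p_subj


def subjective_expectation(z):
    """Expected next-period weight under the agent's own beliefs."""
    z_up, z_low, p_subj = posterior_weights(z)
    return p_subj * z_up + (1.0 - p_subj) * z_low


def true_expectation(z):
    """Expected next-period weight under the true upward-mobility probability."""
    z_up, z_low, _ = posterior_weights(z)
    p_true = PI0_STAR + THETA_STAR * effort_at(z)
    return p_true * z_up + (1.0 - p_true) * z_low


if __name__ == "__main__":
    for z in (0.001, 0.01, 0.05):
        print(z, posterior_weights(z)[:2], subjective_expectation(z), true_expectation(z))
```

In this example the subjective expectation equals the current weight, while the true expectation exceeds it, so the weight on the perturbing belief tends to grow rather than vanish.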